Associations of genetic and infectious risk factors with coronary heart disease

Coronary heart disease (CHD) is one of the most pressing health problems of our time and a major cause of preventable death. CHD results from complex interactions between genetic and environmental factors. Using multiplex serological testing for persistent or frequently recurring infections and genome-wide analysis in a prospective population study, we delineate the respective and combined influences of genetic variation, infections, and low-grade inflammation on the risk of incident CHD. Study participants are enrolled in the CoLaus|PsyCoLaus study, a longitudinal, population-based cohort with baseline assessments from 2003 through 2008 and follow-up visits every 5 years. We analyzed a subgroup of 3459 individuals with available genome-wide genotyping data and immunoglobulin G levels for 22 persistent or frequently recurring pathogens. All reported CHD events were evaluated by a panel of specialists. We identified independent associations with incident CHD using univariable and multivariable stepwise Cox proportional hazards regression analyses. Of the 3459 study participants, 210 (6.07%) had at least one CHD event during the 12 years of follow-up. Multivariable stepwise Cox regression analysis, adjusted for known cardiovascular risk factors, socioeconomic status, and statin intake, revealed that high polygenic risk (hazard ratio [HR] 1.31, 95% CI 1.10–1.56, p = 2.64 × 10⁻³) and infection with Fusobacterium nucleatum (HR 1.63, 95% CI 1.08–2.45, p = 1.99 × 10⁻²) were independently associated with incident CHD. In a prospective, population-based cohort, high polygenic risk and infection with F. nucleatum have a small yet independent impact on CHD risk.


Introduction
Worldwide, cardiovascular diseases (CVDs) are the leading cause of mortality (Roth et al., 2018). An estimated 17.9 million people die from CVD each year, accounting for 32% of all deaths. CVD is a broad term for medical conditions involving the heart and blood vessels, such as coronary heart disease (CHD), congenital heart disease, cerebrovascular disease, peripheral arterial disease, rheumatic heart disease, deep vein thrombosis, and pulmonary embolism (World Health Organization, 2009).
CHD is the most common type of heart disease (Roth et al., 2018). It is caused by atherosclerosis, a build-up of plaque inside the walls of the arteries that supply blood to the heart. CHD progresses over a long period of time and eventually evolves into symptoms such as chest pain (angina), tightness in the chest, breathing difficulties, and pain in the arms or shoulders (Ambrose and Singh, 2015). A complete blockage can cause a heart attack.
A combination of demographic, environmental, and genetic factors contributes to the development of CHD (Khot et al., 2003; Mendis et al., 2011). The main risk factors associated with the development of CHD (smoking, diabetes, hyperlipidemia, and hypertension) have been established by extensive epidemiological research (MacMahon et al., 1990; Stamler et al., 1993; Verschuren et al., 1995; Weintraub, 1990). Age is also an important risk factor for CHD (Castelli, 1984). Finally, the incidence of CHD is greater in males than in females (Castelli, 1984). Recently, a new algorithm, named Systematic COronary Risk Evaluation 2 (SCORE2), was developed to predict the 10-year risk of first-onset CVD in European populations (SCORE2 working group and ESC Cardiovascular risk collaboration, 2021; SCORE2-OP working group and ESC Cardiovascular risk collaboration, 2021). This score has replaced the existing HeartScore scoring system and incorporates most of the risk factors mentioned above (Conroy et al., 2003).
CHD also has an important genetic component. In 1938, the first familial risk model for CHD was described and later confirmed by clinical observations and large studies of twins and of longitudinal cohorts (Müller, 1938; Marenberg et al., 1994; Samani et al., 2007; Abraham et al., 2016). Based on whole-genome approaches, the heritability of CHD has been estimated at 40–60%, even after controlling for known risk factors (Vinkhuyzen et al., 2013).
Multiple clinical studies have identified inflammatory risk factors that are predictive of future cardiovascular events (Alfaddagh et al., 2020; Libby, 2006; Hansson, 2005). Endothelial dysfunction and subintimal cholesterol have been shown to trigger an inflammatory cascade that involves activated macrophages and leads to atherosclerotic lesions. At the molecular level, inflammasome formation in macrophages plays an essential role in the propagation of inflammation through the production of interleukin (IL)-1β. Once released, these cytokines activate various inflammatory cells and induce the production of IL-6, which in turn stimulates C-reactive protein (CRP) production by the liver, further enhancing the inflammatory cascade within the vascular wall. Today, CRP is an established biomarker of systemic inflammation and a possible predictor of future cardiovascular events (Libby, 2006).
The recognition of atherosclerosis as an inflammatory disease has renewed interest in examining the role of pathogens in CHD and other CVDs. Nearly 150 years ago, acute infection with Bacillus typhosus was found to cause sclerosing changes in the arterial wall (Gilbert and Lion, 1889). A century later, interest in a potential role of infection in atherosclerosis was revived by the discovery that CHD-positive individuals are more likely to have elevated levels of antibodies to Chlamydia pneumoniae (C. pneumoniae) (Saikku et al., 1988). This was followed by reports of associations between CHD and several other infectious agents, including bacteria and viruses, such as Helicobacter pylori (H. pylori), hepatitis C virus (HCV), and human herpes viruses (Adinolfi et al., 2018; Filardo et al., 2015; Wang et al., 2020; Zhang et al., 2008). The exact mechanisms linking infection to low-grade inflammation and atherosclerosis are still being studied, though some potential pathways have been proposed. One proposed mechanism involves the production of pro-inflammatory molecules in response to an infection (Campbell and Rosenfeld, 2015). These molecules, such as cytokines, can increase the activity of cells involved in atherosclerosis, such as macrophages and smooth muscle cells, leading to the formation of plaques and other changes in the walls of arteries (Campbell and Rosenfeld, 2015). Another mechanism relates to inflammation at the site of the vessel wall, characterized by the presence of infectious agents within the atherosclerotic plaques. The consequences of infection for the atherosclerotic plaque can be accelerated progression or a final complication such as thrombosis or plaque rupture (Pedicino et al., 2013).
Although enormous progress has been made in the understanding of CHD pathogenesis, the overall picture of the combined contribution of infectious, inflammatory, and genetic factors to the risk of developing CHD in the general population remains incomplete. We here use data from the CoLaus|PsyCoLaus study, a well-characterized, longitudinal, population-based study from Switzerland, to obtain a more comprehensive view of the evidence for the respective contributions of these factors to CHD.

Demographic and serological characteristics
A total of 3459 CoLaus|PsyCoLaus participants with available phenotypic, serological, and genotypic data were included. Their characteristics are presented in Table 1.
We also investigated participants' serostatus for the following 22 persistent or frequently recurring pathogens: 15 viruses (BKV, JCV, HPyV6, WUPyV, HSV-1, HSV-2, VZV, EBV, CMV, HHV-6A, HHV-6B, HHV-7, KSHV, PVB-19, and rubella virus); six bacteria (C. diphtheriae, C. tetani, C. trachomatis, F. nucleatum, H. pylori, and S. gallolyticus); and one parasite (T. gondii) (Appendix 1-table 1). The overall seropositivity ranged from 3.99% (S. gallolyticus) to 96.80% (EBV). The overall serostatus, split between CHD-positive individuals (with at least one CHD event during follow-up) and CHD-negative individuals, is shown in Figure 1. Rubella, C. tetani, and C. diphtheriae were excluded from further analyses, as the antibodies detected against these pathogens were most likely induced by vaccination.
Finally, we calculated a CHD polygenic risk score (CHD-PRS) for each subject to investigate the effect of common human genetic variations on CHD. As expected, we observed a significant association between the PRS and CHD (HR 1.32, 95% CI 1.16–1.51, p = 4.29 × 10⁻⁵), confirming that genetic predisposition to CHD can be captured through CHD-PRS. The top three genetic principal components (PC1, PC2, and PC3) were not significantly associated with CHD (Appendix 1-table 2).
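As a rough consistency check, a two-sided p-value can be approximated from a reported hazard ratio and its 95% CI. The sketch below assumes a Wald test on the log-hazard scale with a symmetric interval, which the text does not state explicitly; the helper name `wald_p_from_hr` is hypothetical, and because the inputs are the rounded values quoted above, the result only approximates the reported p-value.

```python
import math

def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate (z, two-sided p) from a hazard ratio and its 95% CI.

    Assumes the CI is symmetric around log(HR), i.e. a Wald interval.
    """
    # Standard error of log(HR), recovered from the CI width on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Univariable CHD-PRS estimate quoted in the text (rounded values).
z, p = wald_p_from_hr(1.32, 1.16, 1.51)
print(f"z = {z:.2f}, p = {p:.2e}")
```

The approximate p-value falls in the same order of magnitude as the reported one; small differences are expected from rounding of the HR and CI.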

Co-linearity and proportional hazard assumption testing
We calculated pairwise correlations between all variables that were significant in univariable analyses. Appendix 2-figure 1 and Appendix 2-figure 2 show that no strong correlations exist between significant variables. The strongest correlations were observed between SCORE2 and hs-CRP (Pearson's coefficient 0.22) and between seropositivity to C. trachomatis and gross monthly household income (Cramér's V 0.15). The proportional hazards assumption was tested for all significant variables using Schoenfeld residuals. The residual tests indicated that all variables satisfied the assumption, that is, the effects of all covariates were constant over time (Appendix 2-figure 3). Finally, we assessed potential co-linearity among predictors that could affect model fitting. No variance inflation factor (VIF) value was indicative of co-linearity.

Multivariable model
To identify the independent risk factors of CHD in our cohort, we performed backward stepwise selection on 2323 individuals with non-missing data using a multivariable Cox proportional hazards model, starting with all the significant factors from the univariable models. The final multivariable analysis confirmed that SCORE2 (HR 1.96 per SD increase, 95% CI 1.74–2.22, p = 2.42 × 10⁻²⁷) is an independent prognostic factor of CHD (Figure 2). We also observed significant independent associations for statin intake (HR 2.24, 95% CI 1.50–3.35, p = 9.17 × 10⁻⁵) and for seropositivity to F. nucleatum (HR 1.63, 95% CI 1.08–2.45, p = 1.99 × 10⁻²). Comparing individuals who had at least one CHD event (CHD group) with those who had no event during the follow-up period (control group), 22.4% (47/210) of the individuals in the CHD group were seropositive to F. nucleatum, versus 14.6% (473/3249) in the control group (p = 0.003) (Figure 1, Table 1). Lastly, we also observed a significant association between CHD occurrence and elevated CHD-PRS, with an HR of 1.31 (95% CI 1.10–1.56, p = 3.32 × 10⁻³) per SD increase.
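The group comparison of F. nucleatum seropositivity can be illustrated with a 2×2 contingency test. The text does not say which test produced p = 0.003; the sketch below assumes a Pearson chi-square test with Yates continuity correction, which is one common choice, and the function name `chi2_2x2_yates` is hypothetical.

```python
import math

def chi2_2x2_yates(a, b, c, d):
    """Chi-square test (1 df, Yates correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Continuity-corrected numerator, floored at zero.
    num = max(abs(a * d - b * c) - n / 2, 0.0) ** 2 * n
    den = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = num / den
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts quoted in the text: seropositive / total in each group.
chd_pos, chd_total = 47, 210
ctl_pos, ctl_total = 473, 3249

chi2, p = chi2_2x2_yates(chd_pos, chd_total - chd_pos,
                         ctl_pos, ctl_total - ctl_pos)
print(f"{chd_pos / chd_total:.1%} vs {ctl_pos / ctl_total:.1%}, "
      f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under this assumption the p-value is of the same order as the one reported; a different test (for example, Fisher's exact test) would give a slightly different value.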
To assess if the overall burden of infections contributed to increased risk of CHD, study participants were stratified according to their overall seropositivity index for measured pathogens, calculated by summing the number of pathogens for which they show seropositivity (range: 0-16). The numbers of individuals in each pathogen burden stratum are shown in Appendix 2-figure 4. In the univariable Cox model, pathogen burden significantly increased the risk of CHD occurrence (HR 1.11, 95% CI   -table 2). However, after adjustment with multivariable Cox proportional hazards regression, pathogen burden did not meet the level of significance for staying in the model.
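The seropositivity index described above is a simple count. A minimal sketch follows, assuming that the vaccine-related pathogens excluded from further analyses are also left out of the count; the function name, the dictionary layout, and the participant values are hypothetical.

```python
# Pathogens excluded from further analyses because antibodies likely reflect vaccination.
VACCINE_RELATED = {"rubella", "C. tetani", "C. diphtheriae"}

def pathogen_burden(serostatus, excluded=VACCINE_RELATED):
    """Count the pathogens for which a participant is seropositive.

    serostatus maps pathogen name -> True (seropositive) / False (seronegative).
    """
    return sum(1 for name, positive in serostatus.items()
               if positive and name not in excluded)

# Hypothetical participant, for illustration only.
participant = {
    "EBV": True,
    "CMV": False,
    "HSV-1": True,
    "F. nucleatum": True,
    "rubella": True,      # ignored: vaccine-related
    "T. gondii": False,
}
burden = pathogen_burden(participant)
print(burden)
```

The resulting integer can then be entered as a single covariate in a Cox model, which is how a per-unit hazard ratio for pathogen burden would be obtained.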

Discussion
CHD is a complex disease that is influenced by demographic, environmental, and genetic factors (Khot et al., 2003; Mendis et al., 2011). Infections have also been suspected to increase the risk of CHD, directly or through the induction of chronic inflammation (Vojdani, 2003). The present study investigated the independent and combined effects of these risk factors as possible prognostic indicators for the occurrence of CHD. We performed an event-free survival analysis of incident CHD using data from a longitudinal, population-based study, in which more than 6% of participants developed CHD over a 12-year study period. We confirmed the utility of SCORE2 to predict CHD risk in our cohort (SCORE2 working group and ESC Cardiovascular risk collaboration, 2021). Of note, chronic inflammation as reflected by hs-CRP level did not appear as an independent predictor of CHD in our analyses, as the univariable association signal was suppressed after adjustment for SCORE2 levels.
We studied the effect of human genetic determinants on CHD occurrence using PRS, and we reproduced previously observed effects: participants with a higher CHD-PRS have a greater risk of CHD, even after adjustment for all known factors (Ding et al., 2011;Kullo et al., 2016). This result confirms the existence of genetic susceptibility loci for CHD, and that the individual genetic background modulates CHD risk independently from age, sex, or co-morbidities. Our work confirms the potential interest in using PRS to improve the prediction of coronary events.
We also evaluated the potential contribution of multiple persistent or frequently recurring pathogens to CHD after controlling for conventional CHD risk factors, socioeconomic status, and human genetic variability. We observed an association of CHD with detection of antibodies against F. nucleatum. This pathogen is very prevalent in humans (Adams et al., 2004; Afra et al., 2013; Looker et al., 2015). F. nucleatum is an anaerobic bacterium that belongs to the normal flora of the oral cavity and plays an important role in the development and progression of oral diseases, such as gingivitis (gum inflammation) and periodontitis (infection of the gums). Under pathological conditions, the pathogen can spread by the hematogenous route to extra-oral systemic sites, including the gut and the female genital tract (Han and Wang, 2013; Han et al., 2004). Studies have also suggested the involvement of F. nucleatum in CVD, first through its capacity to migrate directly into arterial plaques, thus exacerbating atherosclerosis, and more recently through the association between periodontitis and CVD (Kholy et al., 2015; Elkaïm et al., 2008; Figuero et al., 2011; Ford et al., 2006; Han, 2015; Zardawi et al., 2020). Finally, it has been shown that periodontal pathogens are able to spread through the bloodstream from the buccal cavity to the arteries in patients with detectable coronary calcium, a very specific marker of atherosclerosis (Corredor et al., 2022). In summary, the relationship between oral inflammation and CVD could be explained by the colonization of arterial walls and atherosclerotic plaques by dental bacteria, as well as by increased systemic inflammation due to oral infection. However, to date, no direct causality has been established. Moreover, no genome-wide association study on F. nucleatum has been published, either on the humoral immune response (i.e., IgG levels) or on susceptibility to infection/colonization (i.e., serostatus).
HSV-1, HHV-6A, VZV, HPyV6, and C. trachomatis serologies, as well as total burden of infection, were associated with CHD occurrence in univariable models. However, these factors were not significantly associated in the multivariable analysis, suggesting that at least some of them could be indirect markers of socioeconomic status.
Our data do not support the previously reported associations between CHD and H. pylori or CMV. The conflicting reports of possible associations between these pathogens and CHD could be due to differences in sample size, but the associations themselves remain questionable. Further extensive, high-quality studies are needed to thoroughly examine these associations and provide firm conclusions.
Our study has several limitations. As is the case for most longitudinal studies, the absence of data on individuals who dropped out before the end of the follow-up implies that some CHD events could have gone undetected. Also, the demographic information, as well as the clinical and laboratory measurements, were obtained at baseline, and we do not know whether participant information changed over time. Adjustment for risk factors measured at baseline does not account for clinical or demographic changes that could influence CHD outcomes. Similarly, we do not know how the antibody responses against the various antigens evolved over the 12 years of the study. In addition, we did not correct for multiple testing when running the univariable tests of single factors on CHD risk, which may have increased the false-positive rate. Moreover, we were unable to replicate previously published observations of associations of CHD with C. pneumoniae and HCV, as serologies for these pathogens were not available. From a more practical point of view, the identified association with F. nucleatum needs to be replicated and validated in independent cohorts and different populations. Finally, the clinical utility of including genetic and infection biomarkers in CHD prediction algorithms will need to be demonstrated.

Conclusion
CHD is a multicomponent disease that is caused by demographic, environmental, and genetic factors. Inflammation, possibly caused by persistent or frequently recurring infections, can contribute to its development. We identified a statistically significant association between the incidence of CHD and F. nucleatum infection, after adjustment for all established risk factors. We also confirmed that the individual polygenic risk of CVD, calculated from genome-wide genotypes, represents an independent risk factor for incident CHD. Our results can help to better identify subjects at high risk for CHD and provide a rationale for future anti-infective prevention trials.

Study cohort
The CoLaus|PsyCoLaus study is a longitudinal population-based study initiated in Lausanne in 2003; it mainly investigates the biological, environmental, and genetic determinants of CVD (https://www.colaus-psycolaus.ch/) (Firmann et al., 2008). The study involves over 6500 participants of European ancestry, who were recruited at random from the general population and represent an approximately 10% sample of Lausanne citizens. Of the participants, 47.5% are men, and age at enrolment ranged from 35 to 75 years (mean ± SD: 51 ± 10.9). The study participants provided detailed phenotypic information through questionnaires, interviews, and clinical and biological data. Nuclear deoxyribonucleic acid (DNA) was also extracted from the blood for whole-genome genotyping. Every 5 years, follow-up interviews on the participants' lifestyle and health status are conducted. Three follow-ups have been completed, and a fourth began in January 2022. The institutional Ethics Committee of the University of Lausanne, which later became the Ethics Commission of the Canton Vaud (https://www.cer-vd.ch/), approved the CoLaus|PsyCoLaus study (reference 16/03, decisions of January 13 and February 10, 2003), and all participants gave written consent.

Cardiovascular phenotype
The medical records of the participants who reported a CHD event during their lifetime were collected and evaluated by an independent panel of specialists. Information on the cause of death was also collected prospectively during the study period. The full procedure was described previously (Beuret et al., 2021). Only first events occurring after the baseline and up to day 4500 after the baseline were included in the analysis, as only during this period were all participants reliably followed.
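The follow-up window described above translates into the (time, event) pair that a Cox model expects. The sketch below is one possible encoding, not the study's own code: the function name and the `last_contact_day` argument are hypothetical, and it assumes that events at or before baseline (day 0) are not counted as incident events.

```python
MAX_FOLLOW_UP_DAYS = 4500  # cut-off stated in the text

def time_to_first_event(event_days, last_contact_day, cutoff=MAX_FOLLOW_UP_DAYS):
    """Return (time_in_days, event_indicator) for one participant.

    event_days: days since baseline of adjudicated CHD events (may be empty).
    last_contact_day: last day the participant was known to be under follow-up.
    """
    # Observation ends at the earlier of last contact and the administrative cut-off.
    end = min(last_contact_day, cutoff)
    # Keep only events after baseline and within the observation window.
    incident = [d for d in event_days if 0 < d <= end]
    if incident:
        return min(incident), 1   # first event observed
    return end, 0                 # censored without an event

print(time_to_first_event([1200, 3000], 5000))  # event at first occurrence
print(time_to_first_event([4800], 5000))        # event after cut-off: censored at 4500
print(time_to_first_event([], 2100))            # dropout without event: censored at 2100
```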

DNA genotyping data and PRS calculation for cardiovascular phenotypes
The BB2 GSK-customized Affymetrix Axiom Biobank array was used to genotype DNA samples from 5399 participants at approximately 800,000 single nucleotide polymorphisms (SNPs). After genotype imputation and quality control procedures, approximately 9 million SNPs were available for analysis (Hodel et al., 2021). We then calculated, based on the risk effects of common SNPs, the CHD-PRS for each study participant. We used validated PRS from Inouye et al., available in the Polygenic Score Catalog (Inouye et al., 2018; Lambert et al., 2021). These scores and summary statistics were used to construct the CHD-PRS in our target cohort data by using the clumping and thresholding method of the PRSice-2 v2.2.7 software (Choi et al., 2020). A standardized method was used to obtain the PRS, by multiplying the risk allele dosage for each variant by the effect size and summing the scores across all selected variants. SNPs were clumped according to linkage disequilibrium (r² < 0.1) within a 250 kb window.
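The scoring step (risk-allele dosage multiplied by effect size, summed over the selected variants) can be sketched as follows. This is an illustration only: it does not reproduce PRSice-2 or the clumping step, and the function name, SNP identifiers, dosages, and effect sizes are all hypothetical.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages for one participant.

    dosages: SNP id -> risk-allele dosage (0 to 2; fractional if imputed).
    effect_sizes: SNP id -> per-allele effect size (e.g. log odds ratio).
    Variants missing from the participant's genotype data are skipped.
    """
    return sum(dosages[snp] * beta
               for snp, beta in effect_sizes.items()
               if snp in dosages)

# Hypothetical variants that survived clumping, with made-up weights.
effect_sizes = {"snp_a": 0.12, "snp_b": -0.05, "snp_c": 0.30}
dosages = {"snp_a": 2.0, "snp_b": 1.0, "snp_c": 0.4}

score = polygenic_risk_score(dosages, effect_sizes)
print(round(score, 2))
```

Because such raw scores have no natural unit, they are usually standardized across the cohort before modeling, which is what allows a hazard ratio to be reported per SD increase.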

CHD risk evaluation
The risk of CHD for each participant was also assessed using the recent SCORE2 and SCORE2-Older Persons (SCORE2-OP, for individuals >65 years of age) algorithms (SCORE2-OP working group and ESC Cardiovascular risk collaboration, 2021; SCORE2 working group and ESC Cardiovascular risk collaboration, 2021). These two algorithms will be referred to as SCORE2. SCORE2 was derived, calibrated, and validated to predict the 10-year risk of first-onset CVD using data from 13 million individuals from >50 European prospective studies and national registries. To develop this algorithm, the authors used competing risk-adjusted and age- and sex-specific models including age, current smoking, systolic blood pressure, and total, low-density lipoprotein (LDL), and high-density lipoprotein (HDL) cholesterol. The authors also defined four risk regions in Europe on the basis of country-specific CVD mortality. For CoLaus|PsyCoLaus participants, calculations were based on the low-risk region corresponding to Switzerland. The raw scores of participants were standardized to Z-scores with approximately zero mean and unit variance before data analysis.
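The standardization step can be sketched as below. The text does not state whether the sample or population standard deviation was used; the sample SD is assumed here, and the raw scores are hypothetical values rather than study data.

```python
import statistics

def standardize(values):
    """Convert raw scores to Z-scores (mean 0, sample SD 1)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator); an assumption
    return [(v - mean) / sd for v in values]

# Hypothetical raw 10-year risk estimates (%), for illustration only.
raw_scores = [1.8, 3.2, 5.5, 2.4, 7.1]
z_scores = standardize(raw_scores)

print([round(z, 2) for z in z_scores])
```

With a standardized covariate, the Cox coefficient is expressed per one SD of the score, so the exponentiated coefficient reads as a hazard ratio per SD increase.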

Measurement of inflammatory biomarkers
Venous blood samples (50 mL) were drawn from participants in the fasting state. Before cytokine assessment, the serum samples were stored at −80°C and then sent to the laboratory on dry ice. The measurements of hs-CRP, IL-1β, IL-6, and TNF-α levels were described previously in detail (Marques-Vidal et al., 2011). Briefly, hs-CRP levels were assessed by immunoassay and latex HS (IMMULITE 1000-High, Diagnostic Products Corporation, Los Angeles, CA, USA). Cytokine levels were measured using a multiplexed particle-based flow cytometric cytokine assay on the flow cytometer (FC500 MPL, Beckman Coulter, Nyon, Switzerland), following the manufacturer's instructions. The lower limits of detection for IL-1β, IL-6, and TNF-α were 0.2 pg/mL. Intra- and interassay coefficients of variation were, respectively, 15% and 16.7% for IL-1β, 16.9% and 16.1% for IL-6, and 12.5% and 13.5% for TNF-α. For quality control, repeat measurements were performed on 80 subjects randomly selected from the initial sample. Individuals with hs-CRP levels above 20 mg/L were assigned a value of 20 by the manufacturer and were therefore removed from the hs-CRP analyses, as such levels are indicative of acute inflammation.

Serological analyses
To assess the humoral responses to a total of 38 antigens derived from 22 persistent infectious agents, serum samples were analyzed by the Infections and Cancer Epidemiology Division at the German Cancer Research Center (Deutsches Krebsforschungszentrum [DKFZ]) in Heidelberg (Waterboer et al., 2005; Waterboer et al., 2006). Studied pathogens included 15 viruses (BKV, JCV, HPyV6, WUPyV, HSV-1, HSV-2, VZV, EBV, CMV, HHV-6A, HHV-6B, HHV-7, KSHV, PVB-19, and rubella virus); six bacteria (C. diphtheriae, C. tetani, C. trachomatis, F. nucleatum, H. pylori, and S. gallolyticus); and one parasite (T. gondii) (for details, see Appendix 1-table 1). Seroreactivity was measured at a serum dilution of 1:1000 using multiplex serology based on glutathione S-transferase fusion capture immunosorbent assays combined with fluorescent bead technology. For each infectious agent tested, the antibody responses were measured for one to six antigens and then expressed as a binary result (IgG positive or negative), based on predefined median fluorescence intensity thresholds. To define overall seropositivity against infectious agents when more than one antigen was used, we applied the pathogen-specific algorithms suggested by the manufacturer (see references in Appendix 1-table 1).
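The conversion from median fluorescence intensity (MFI) to a binary serostatus can be illustrated with a simple threshold rule. The actual pathogen-specific algorithms are given in the references of Appendix 1-table 1 and are not reproduced here; the "at least k positive antigens" rule, the function name, the antigen labels, and all numeric values below are hypothetical.

```python
def pathogen_serostatus(mfi_by_antigen, cutoffs, min_positive_antigens=1):
    """Return True if enough antigens exceed their MFI cut-off.

    Illustrative rule only: real pathogen-specific algorithms may combine
    antigens differently.
    """
    # Each antigen is called positive when its MFI is above its own cut-off.
    n_positive = sum(1 for antigen, mfi in mfi_by_antigen.items()
                     if mfi > cutoffs[antigen])
    return n_positive >= min_positive_antigens

# Hypothetical cut-offs and measurements for a pathogen tested with two antigens.
cutoffs = {"antigen_1": 150.0, "antigen_2": 200.0}
sample = {"antigen_1": 320.0, "antigen_2": 90.0}

any_rule = pathogen_serostatus(sample, cutoffs)                            # one antigen suffices
both_rule = pathogen_serostatus(sample, cutoffs, min_positive_antigens=2)  # both required
print(any_rule, both_rule)
```

The example shows why the combination rule matters: the same measurements yield a different serostatus depending on how many antigens must be positive.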

Statistical analyses
Univariable and multivariable Cox proportional hazards models were used to explore the relationship between risk factors and CHD incidence in the CoLaus|PsyCoLaus study. Each variable was first screened in the univariable model. To identify potential confounding due to population structure, we also tested the top three genetic principal components (PC1, PC2, and PC3) for association with CHD. We then examined the proportional hazards assumption of the significant (p < 0.05) covariates by using the scaled Schoenfeld residuals. The residuals were plotted over time for each covariate to test for time independence. Risk factors significantly associated with CHD in the univariable model were further evaluated using pairwise correlations. Finally, the identified risk factors were assessed using multivariable stepwise Cox regression analysis, adjusted for competing risk (i.e., SCORE2), socioeconomic status (i.e., gross monthly household income), and statin intake. Potential multicollinearity between statistically significant factors (p < 0.05) was assessed using VIFs, with a VIF value > 2 taken to indicate multicollinearity between covariates. We performed all statistical analyses using R (version 4.2.1).
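For reference, the quantities used in these analyses can be written in their standard textbook form. The notation is an assumption introduced here, not taken from the study: h_0(t) is the baseline hazard, x the covariate vector, β the coefficient vector, and R_j² the coefficient of determination obtained when covariate j is regressed on all other covariates.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j),
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

Under this model the hazard ratio for covariate j does not depend on t, which is the proportional hazards assumption examined with the Schoenfeld residuals. A VIF of 2 corresponds to R_j² = 0.5, that is, half of the variance of covariate j being explained by the other covariates.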

Data availability
The data of the CoLaus|PsyCoLaus study used in this article cannot be fully shared as they contain potentially sensitive personal information on participants. According to the Ethics Committee for Research of the Canton of Vaud, sharing these data would be a violation of the Swiss legislation with respect to privacy protection. However, coded individual-level data that do not allow researchers to identify participants are available upon request to researchers who meet the criteria for data sharing of the CoLaus|PsyCoLaus Datacenter (CHUV, Lausanne, Switzerland). Any researcher affiliated to a public or private research institution who complies with the CoLaus|PsyCoLaus standards can submit a research application to research.colaus@chuv.ch or research.psycolaus@chuv.ch. Proposals requiring baseline data only will be evaluated by the baseline (local) Scientific Committee (SC) of the CoLaus and PsyCoLaus studies. Proposals requiring follow-up data will be evaluated by the follow-up (multicentric) SC of the CoLaus|PsyCoLaus cohort study. Detailed instructions for gaining access to the CoLaus|PsyCoLaus data used in this study are available at https://www.colaus-psycolaus.ch/professionals/how-to-collaborate/. The underlying code used to analyze the data in this manuscript is publicly available on GitHub (https://github.com/flaviahodel/cox-chd-analysis, copy archived at swh:1:rev:317cf14e0fbf09f214b792cc5ea5a399739e15a1). Source Data files have been provided for all Figures and Figure Supple